AITopics | privacy loss

Poisson subsampling is the default sampling scheme in differentially private machine learning, largely because its unstructured randomness yields tractable privacy amplification analyses. Yet this same randomness introduces substantial participation variance: each sample appears in very different numbers of training iterations. In this work, we show that this variance is not merely a practical artifact to be tolerated, but a fundamental source of suboptimal privacy amplification. We prove that Balanced Iteration Subsampling (BIS), a structured scheme in which each sample participates in exactly a fixed number of iterations, achieves stronger privacy amplification than Poisson subsampling and is optimal at both extremes of the noise spectrum ($σ\to 0$ and $σ\to \infty$). Our analysis reveals that the privacy-noise tradeoff is governed not by maximizing randomness, but by eliminating participation variance while preserving uniform marginal participation across iterations. To translate this asymptotic theory into finite-noise guarantees, we introduce a practical near-exact Monte Carlo accountant for BIS, which removes the analytical slack of existing RDP and composition-based PLD analyses. Evaluations across more than 60 practical DP-SGD configurations show that BIS consistently outperforms Poisson subsampling in the low-noise regimes most relevant for high-utility private training, reducing the required noise multiplier by up to $9.6\%$. These results overturn the common intuition that more sampling randomness necessarily yields stronger privacy amplification: in DP-SGD, structured participation can be both more practical and more private. Our implementation is available at https://github.com/dong-xin-ao-andy/bis-mc-accountant.

artificial intelligence, machine learning, poisson, (13 more...)

arXiv.org Machine Learning

2605.07072

Genre: Research Report (0.64)

Industry: Information Technology > Security & Privacy (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Security & Privacy (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Add feedback

4c27cea8526af8cfee3be5e183ac9605-Paper.pdf

Neural Information Processing SystemsApr-25-2026, 19:14:00 GMT

agent, artificial intelligence, machine learning, (16 more...)

Neural Information Processing Systems

Genre: Research Report > New Finding (0.93)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Data Science (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)

Add feedback

37ecd27608480aa3569a511a638ca74f-Paper.pdf

Neural Information Processing SystemsApr-25-2026, 11:39:15 GMT

artificial intelligence, inductive learning, machine learning, (18 more...)

Neural Information Processing Systems

Genre: Research Report (0.94)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)

Add feedback

1165af8b913fb836c6280b42d6e0084f-Paper-Conference.pdf

Neural Information Processing SystemsApr-25-2026, 00:30:16 GMT

artificial intelligence, machine learning, privacy, (16 more...)

Neural Information Processing Systems

Country: Europe (0.28)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.95)

Add feedback

18561617ca0b4ffa293166b3186e04b0-Paper-Conference.pdf

Neural Information Processing SystemsApr-24-2026, 20:34:13 GMT

artificial intelligence, machine learning, privacy amplification, (17 more...)

Neural Information Processing Systems

Country: North America > United States (0.28)

Genre: Research Report (0.46)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Add feedback

Differentially Private Learning Needs Hidden State (Or Much Faster Convergence)

Neural Information Processing SystemsApr-24-2026, 08:56:05 GMT

Prior work on differential privacy analysis of randomized SGD algorithms relies on composition theorems, where the implicit (unrealistic) assumption is that the internal state of the iterative algorithm is revealed to the adversary. As a result, the Rényi DP bounds derived by such composition-based analyses linearly grow with the number of training epochs. When the internal state of the algorithm is hidden, we prove a converging privacy bound for noisy stochastic gradient descent (on strongly convex smooth loss functions). We show how to take advantage of privacy amplification by sub-sampling and randomized post-processing, and prove the dynamics of privacy bound for "shuffle and partition" and "sample without replacement" stochastic mini-batch gradient descent schemes. We prove that, in these settings, our privacy bound converges exponentially fast and is substantially smaller than the composition bounds, notably after a few number of training epochs. Thus, unless the DP algorithm converges fast, our privacy analysis shows that hidden state analysis can significantly amplify differential privacy.

artificial intelligence, machine learning, privacy amplification, (15 more...)

Neural Information Processing Systems

Genre: Research Report > New Finding (0.30)

Industry: Information Technology > Security & Privacy (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.92)

Add feedback